Power Function Distribution (powerlaw)#

The power function distribution (called powerlaw in SciPy) is a simple one-parameter family of continuous distributions on \([0,1]\) with CDF

\[F(x)=x^a.\]

Equivalently,

\[X \sim \texttt{powerlaw}(a) \quad\Longleftrightarrow\quad X \sim \mathrm{Beta}(a,1).\]

It’s useful whenever you need a flexible model for a bounded proportion that can concentrate near 0 or near 1. It also appears naturally in order statistics: the maximum of \(n\) i.i.d. Uniform\((0,1)\) variables has a powerlaw distribution with \(a=n\).

What you’ll learn#

  • classification, support, and parameterization (including SciPy’s loc/scale)

  • PDF/CDF/PPF and connections to Beta and order statistics

  • moments (mean/variance/skewness/kurtosis), MGF/CF, and entropy

  • how the shape parameter \(a\) changes the distribution

  • derivations: expectation, variance, likelihood, and the MLE for \(a\)

  • NumPy-only sampling via inverse CDF + Monte Carlo validation

  • practical usage via scipy.stats.powerlaw (pdf, cdf, rvs, fit)

  • hypothesis testing, Bayesian modeling, and generative modeling patterns

Notebook roadmap#

  1. Title & classification

  2. Intuition & motivation

  3. Formal definition (PDF/CDF)

  4. Moments & properties

  5. Parameter interpretation

  6. Derivations (\(\mathbb{E}[X]\), \(\mathrm{Var}(X)\), likelihood)

  7. Sampling & simulation (NumPy-only)

  8. Visualization (PDF, CDF, Monte Carlo)

  9. SciPy integration (scipy.stats.powerlaw)

  10. Statistical use cases

  11. Pitfalls

  12. Summary

import math

import numpy as np
import scipy
from scipy import special, stats

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

# Plotly rendering (CKC convention)
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

# Reproducibility
SEED = 7
rng = np.random.default_rng(SEED)

np.set_printoptions(precision=4, suppress=True)

print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
print("plotly", plotly.__version__)
numpy  1.26.2
scipy  1.15.0
plotly 6.5.2

Prerequisites & notation#

Prerequisites

  • comfort with basic calculus (single-variable integration)

  • basic probability (PDF/CDF, expectation, variance)

  • basic statistics (likelihood, MLE)

Important terminology

  • In SciPy, powerlaw is the power function distribution on a bounded interval.

  • It is not the heavy-tailed “power-law” distribution often used for wealth/city sizes (see scipy.stats.pareto, scipy.stats.zipf, etc.).

SciPy parameterization. SciPy uses a location-scale family:

  • scipy.stats.powerlaw(a, loc=ℓ, scale=s) has support \(x \in [\ell, \ell+s]\) with \(s>0\).

  • The canonical form in this notebook uses \(\ell=0\), \(s=1\).
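A minimal sketch (the parameter values are arbitrary) confirming the support endpoints via the frozen distribution’s support() method:

```python
from scipy import stats

# Sketch: powerlaw(a, loc=ℓ, scale=s) is supported on [ℓ, ℓ+s].
# a=2.0, loc=1.0, scale=3.0 are arbitrary illustrative values.
rv = stats.powerlaw(2.0, loc=1.0, scale=3.0)
lo, hi = rv.support()
print(lo, hi)  # support endpoints ℓ and ℓ+s
```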

1) Title & Classification#

  • Name: powerlaw (power function distribution)

  • Type: Continuous

  • Support (canonical): \(x \in [0,1]\) (density is defined for \(0<x<1\))

  • Parameter space (canonical): shape \(a>0\)

With SciPy’s location-scale parameters:

  • Support: \(x \in [\ell, \ell+s]\)

  • Parameters: \(a>0\), \(\ell \in \mathbb{R}\), \(s>0\)

We write:

\[X \sim \texttt{powerlaw}(a) \equiv \mathrm{Beta}(a,1).\]

2) Intuition & Motivation#

2.1 What it models#

powerlaw models a bounded random variable on \([0,1]\) whose probability mass can be “tilted” toward 0 or toward 1 with a single shape parameter \(a\).

A useful way to think about it is via the transformation:

  • If \(U \sim \mathrm{Uniform}(0,1)\), then \(X = U^{1/a} \sim \texttt{powerlaw}(a)\).

So \(a\) controls how aggressively the uniform is “stretched” toward 0 (\(a<1\)) or toward 1 (\(a>1\)).

2.2 Real-world use cases#

  • Proportions that lean toward 0 or 1: completion rates, saturation fractions, normalized scores

  • Order statistics: maxima of uniforms (e.g., “best-of-\(n\)” effects)

  • P-values under alternatives (simple model): p-values often concentrate near 0; \(\mathrm{Beta}(a,1)\) with \(a<1\) is a common approximation

  • Simple generative noise on ([0,1]): random thresholds, random quantiles

2.3 Relations to other distributions#

  • Beta: powerlaw(a) is exactly \(\mathrm{Beta}(a,1)\).

  • Uniform: \(a=1\) gives \(f(x)=1\) on \([0,1]\).

  • Exponential (via log transform): if \(X \sim \texttt{powerlaw}(a)\), then \(-\log X \sim \mathrm{Exponential}(\text{rate}=a)\).

  • Order statistic: if \(U_1,\dots,U_n\) are i.i.d. Uniform\((0,1)\), then \(\max_i U_i \sim \texttt{powerlaw}(n)\).
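These relations can be spot-checked numerically; the sketch below (seed, shape value, and sample size are arbitrary choices) compares inverse-transform samples against the corresponding SciPy distributions with Kolmogorov–Smirnov tests:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = 2.5
x = rng.random(100_000) ** (1.0 / a)  # X = U^{1/a} ~ powerlaw(a)

# Relation 1: -log X ~ Exponential(rate=a), i.e. scale=1/a in SciPy.
ks_exp = stats.kstest(-np.log(x), stats.expon(scale=1.0 / a).cdf)

# Relation 2: powerlaw(a) coincides with Beta(a, 1).
ks_beta = stats.kstest(x, stats.beta(a, 1.0).cdf)

print(ks_exp.statistic, ks_beta.statistic)  # both should be tiny
```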

3) Formal Definition#

3.1 Canonical PDF#

For \(a>0\) and \(0<x<1\):

\[f(x\mid a)= a\,x^{a-1}.\]

3.2 Canonical CDF#

The CDF is

\[\begin{split}F(x\mid a)=\begin{cases} 0, & x\le 0\\ x^a, & 0<x<1\\ 1, & x\ge 1\,. \end{cases}\end{split}\]

3.3 Quantile function (inverse CDF)#

For \(u\in(0,1)\):

\[F^{-1}(u)=u^{1/a}.\]

This yields a simple inverse-transform sampler.

3.4 Location-scale form (SciPy)#

If \(X\sim\texttt{powerlaw}(a)\) on \([0,1]\) and we define \(Y = \ell + sX\) with \(s>0\), then \(Y\) has support \([\ell,\ell+s]\) and

\[f_Y(y\mid a,\ell,s)=\frac{a}{s}\left(\frac{y-\ell}{s}\right)^{a-1},\quad \ell<y<\ell+s,\]
\[F_Y(y\mid a,\ell,s)=\left(\frac{y-\ell}{s}\right)^a,\quad \ell<y<\ell+s.\]
def powerlaw_pdf(x: np.ndarray, a: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    '''Powerlaw (power function) PDF with SciPy-style loc/scale.

    Canonical form corresponds to loc=0, scale=1.

    Notes
    -----
    For a < 1, the density diverges as x -> loc (from the right).
    '''

    if a <= 0:
        raise ValueError("a must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - loc) / scale

    out = np.zeros_like(z, dtype=float)
    mask = (z > 0) & (z < 1)

    # Compute in log-space for numerical stability
    log_pdf = (math.log(a) - math.log(scale)) + (a - 1.0) * np.log(z[mask])
    out[mask] = np.exp(log_pdf)
    return out


def powerlaw_cdf(x: np.ndarray, a: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    '''Powerlaw CDF with SciPy-style loc/scale.'''

    if a <= 0:
        raise ValueError("a must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - loc) / scale

    out = np.zeros_like(z, dtype=float)
    out[z >= 1.0] = 1.0

    mask = (z > 0) & (z < 1)
    out[mask] = np.power(z[mask], a)
    return out


def powerlaw_ppf(u: np.ndarray, a: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    '''Powerlaw quantile function (inverse CDF).'''

    if a <= 0:
        raise ValueError("a must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    u = np.asarray(u, dtype=float)
    if np.any((u < 0) | (u > 1)):
        raise ValueError("u must be in [0, 1]")

    return loc + scale * np.power(u, 1.0 / a)


# Quick sanity check: PDF integrates to ~1 on [0,1]
a0 = 2.5
xgrid = np.linspace(1e-6, 1 - 1e-6, 400_000)
area = np.trapz(powerlaw_pdf(xgrid, a0), xgrid)
area
0.9999975000038249

4) Moments & Properties#

A convenient closed form exists for raw moments. For \(k>-a\):

\[\mathbb{E}[X^k] = \int_0^1 x^k\,a x^{a-1}\,dx = \frac{a}{a+k}.\]

4.1 Mean and variance#

Using \(k=1\) and \(k=2\):

\[\mathbb{E}[X] = \frac{a}{a+1},\qquad \mathbb{E}[X^2] = \frac{a}{a+2},\]
\[\mathrm{Var}(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = \frac{a}{(a+1)^2(a+2)}.\]

4.2 Skewness and kurtosis#

Because powerlaw(a) is \(\mathrm{Beta}(a,1)\), we can reuse standard beta formulas:

\[\gamma_1 = \frac{2(1-a)\sqrt{a+2}}{(a+3)\sqrt{a}},\]
\[\gamma_2\;\text{(excess)} = \frac{6\big((a-1)^2(a+2)-a(a+3)\big)}{a(a+3)(a+4)}.\]

4.3 MGF and characteristic function#

The MGF exists for all real \(t\) (bounded support). A compact special-function expression is

\[M_X(t)=\mathbb{E}[e^{tX}] = {}_1F_1\big(a; a+1; t\big),\]

where \({}_1F_1\) is the confluent hypergeometric function. The characteristic function is

\[\varphi_X(\omega)=\mathbb{E}[e^{i\omega X}] = {}_1F_1\big(a; a+1; i\omega\big).\]

4.4 Entropy#

The differential entropy (in nats) has a simple closed form:

\[H(X)=1-\frac{1}{a}-\log a.\]
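As a quick sketch, the closed form can be cross-checked against SciPy’s built-in entropy (the shape values below are arbitrary):

```python
import math
from scipy import stats

# Sketch: compare the closed form 1 - 1/a - log(a) (nats) to
# scipy.stats.powerlaw(a).entropy() for a few arbitrary shapes.
for a in (0.5, 1.0, 2.5, 7.0):
    closed_form = 1.0 - 1.0 / a - math.log(a)
    scipy_value = float(stats.powerlaw(a).entropy())
    assert math.isclose(closed_form, scipy_value, abs_tol=1e-7)
```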

4.5 Mode (shape intuition)#

  • If \(a>1\), the density increases on \((0,1)\) and the mode is at \(x=1\).

  • If \(a=1\), the distribution is uniform.

  • If \(0<a<1\), the density decreases and diverges at 0 (mode at \(x=0\)).

def powerlaw_moments(a: float) -> dict:
    '''Key moments and summary properties for the canonical powerlaw(a) on [0,1].'''

    if a <= 0:
        raise ValueError("a must be > 0")

    mean = a / (a + 1.0)
    var = a / ((a + 1.0) ** 2 * (a + 2.0))

    skew = (2.0 * (1.0 - a) * math.sqrt(a + 2.0)) / ((a + 3.0) * math.sqrt(a))

    excess_kurt = (6.0 * (((a - 1.0) ** 2) * (a + 2.0) - a * (a + 3.0))) / (a * (a + 3.0) * (a + 4.0))
    kurt = excess_kurt + 3.0

    entropy = 1.0 - 1.0 / a - math.log(a)

    return {
        "mean": mean,
        "variance": var,
        "skewness": skew,
        "kurtosis": kurt,
        "excess_kurtosis": excess_kurt,
        "entropy_nats": entropy,
    }


def powerlaw_mgf(t: np.ndarray, a: float) -> np.ndarray:
    '''MGF M(t) = E[e^{tX}] using SciPy's confluent hypergeometric 1F1.'''

    if a <= 0:
        raise ValueError("a must be > 0")

    t = np.asarray(t)
    return special.hyp1f1(a, a + 1.0, t)


def powerlaw_cf(w: np.ndarray, a: float) -> np.ndarray:
    '''Characteristic function phi(w) = E[e^{i w X}].'''

    if a <= 0:
        raise ValueError("a must be > 0")

    w = np.asarray(w)
    return special.hyp1f1(a, a + 1.0, 1j * w)


# Spot-check: mean/variance via Monte Carlo and an MGF identity

a0 = 2.5
n = 200_000
u = rng.random(n)
# avoid exact 0 for stability in downstream logs
u = np.clip(u, np.finfo(float).tiny, 1.0)
samples = u ** (1.0 / a0)

mom = powerlaw_moments(a0)

mc_mean = samples.mean()
mc_var = samples.var(ddof=0)

# MGF check: E[e^{tX}] at t=1.2

t = 1.2
mc_mgf = np.mean(np.exp(t * samples))
theory_mgf = float(powerlaw_mgf(t, a0))

mom, (mc_mean, mc_var, mc_mgf, theory_mgf)
({'mean': 0.7142857142857143,
  'variance': 0.045351473922902494,
  'skewness': -0.7318040653635675,
  'kurtosis': 2.7566433566433566,
  'excess_kurtosis': -0.24335664335664337,
  'entropy_nats': -0.31629073187415513},
 (0.7148571686605475,
  0.045105947985351946,
  2.430906200738553,
  2.429630435055177))

5) Parameter Interpretation#

powerlaw has a single shape parameter \(a\).

5.1 What \(a\) does#

  • \(0<a<1\): mass concentrates near 0; the PDF diverges at 0.

  • \(a=1\): uniform on \([0,1]\).

  • \(a>1\): mass concentrates near 1; the PDF increases toward 1.

5.2 Mean and median as functions of \(a\)#

  • Mean: \(\mathbb{E}[X]=\frac{a}{a+1}\) (increases toward 1 as \(a\to\infty\))

  • Median: solving \(F(m)=1/2\) gives \(m = 2^{-1/a}\)

These simple formulas make it easy to interpret fitted \(a\) values.
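As a sketch, the median formula can be checked against SciPy’s ppf (the shape values are arbitrary):

```python
from scipy import stats

# Sketch: the median 2**(-1/a) should match powerlaw(a).ppf(0.5).
for a in (0.5, 1.0, 3.0):
    median_formula = 2.0 ** (-1.0 / a)
    median_scipy = float(stats.powerlaw(a).ppf(0.5))
    assert abs(median_formula - median_scipy) < 1e-12
```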

# Shape changes: PDF and CDF for different a

a_values = [0.3, 0.7, 1.0, 2.0, 5.0]
x = np.linspace(1e-4, 1 - 1e-4, 500)

fig_pdf = go.Figure()
fig_cdf = go.Figure()

for a in a_values:
    fig_pdf.add_trace(go.Scatter(x=x, y=powerlaw_pdf(x, a), mode="lines", name=f"a={a:g}"))
    fig_cdf.add_trace(go.Scatter(x=x, y=powerlaw_cdf(x, a), mode="lines", name=f"a={a:g}"))

fig_pdf.update_layout(
    title="Powerlaw PDF (canonical on [0,1])",
    xaxis_title="x",
    yaxis_title="f(x)",
    yaxis_type="log",
)
fig_cdf.update_layout(title="Powerlaw CDF", xaxis_title="x", yaxis_title="F(x)")

fig_pdf.show()
fig_cdf.show()

# Mean as a function of a
agrid = np.linspace(0.2, 8, 300)
mean_grid = agrid / (agrid + 1)

fig_mean = go.Figure()
fig_mean.add_trace(go.Scatter(x=agrid, y=mean_grid, mode="lines"))
fig_mean.update_layout(title="Mean E[X] = a/(a+1)", xaxis_title="a", yaxis_title="E[X]")
fig_mean.show()

6) Derivations#

6.1 Expectation#

Starting from the PDF \(f(x\mid a)=a x^{a-1}\) on \((0,1)\):

\[\begin{align} \mathbb{E}[X] &= \int_0^1 x\, a x^{a-1}\,dx \\ &= a\int_0^1 x^{a}\,dx \\ &= a\left[\frac{x^{a+1}}{a+1}\right]_0^1 \\ &= \frac{a}{a+1}. \end{align}\]

More generally, for \(k>-a\):

\[\begin{align} \mathbb{E}[X^k] &= \int_0^1 x^k\, a x^{a-1}\,dx = a\int_0^1 x^{a+k-1}\,dx = \frac{a}{a+k}. \end{align}\]

6.2 Variance#

Compute \(\mathbb{E}[X^2]=\frac{a}{a+2}\) and subtract the squared mean:

\[\begin{align} \mathrm{Var}(X) &= \mathbb{E}[X^2] - \mathbb{E}[X]^2 \\ &= \frac{a}{a+2} - \left(\frac{a}{a+1}\right)^2 = \frac{a}{(a+1)^2(a+2)}. \end{align}\]

6.3 Likelihood and MLE#

For i.i.d. data \(x_1,\dots,x_n\in(0,1)\), the likelihood is

\[L(a) = \prod_{i=1}^n a x_i^{a-1} = a^n\,\exp\Big((a-1)\sum_{i=1}^n \log x_i\Big).\]

The log-likelihood is

\[\ell(a)= n\log a + (a-1)\sum_{i=1}^n \log x_i.\]

Differentiate and set to zero:

\[\ell'(a)=\frac{n}{a} + \sum_{i=1}^n \log x_i = 0\quad\Rightarrow\quad \hat a = -\frac{n}{\sum_{i=1}^n \log x_i}.\]

Because \(\sum \log x_i < 0\) for \(x_i\in(0,1)\), the MLE \(\hat a\) is positive.

def powerlaw_loglikelihood(x: np.ndarray, a: float, loc: float = 0.0, scale: float = 1.0) -> float:
    '''Log-likelihood for i.i.d. observations under powerlaw(a, loc, scale).'''

    if a <= 0:
        return -np.inf
    if scale <= 0:
        return -np.inf

    x = np.asarray(x, dtype=float)
    z = (x - loc) / scale

    if np.any((z <= 0) | (z >= 1)):
        return -np.inf

    n = z.size
    return n * (math.log(a) - math.log(scale)) + (a - 1.0) * float(np.sum(np.log(z)))


def powerlaw_mle_shape(x: np.ndarray, loc: float = 0.0, scale: float = 1.0) -> float:
    '''Closed-form MLE for the shape a when loc/scale are known.'''

    x = np.asarray(x, dtype=float)
    z = (x - loc) / scale

    if np.any((z <= 0) | (z >= 1)):
        raise ValueError("All observations must satisfy loc < x < loc+scale")

    s = float(np.sum(np.log(z)))  # negative
    n = z.size
    return -n / s


# Demonstrate the MLE on simulated data

a_true = 2.5
n = 50_000
x = np.clip(rng.random(n), np.finfo(float).tiny, 1.0) ** (1.0 / a_true)

a_hat = powerlaw_mle_shape(x)
ll_true = powerlaw_loglikelihood(x, a_true)
ll_hat = powerlaw_loglikelihood(x, a_hat)

a_true, a_hat, (ll_true, ll_hat)
(2.5, 2.5199151026621225, (16051.629136719428, 16053.198881460783))

7) Sampling & Simulation#

Because the CDF is \(F(x)=x^a\), inverse transform sampling is immediate.

7.1 Inverse-transform algorithm (NumPy-only)#

  1. Sample \(U\sim\mathrm{Uniform}(0,1)\)

  2. Return \(X = U^{1/a}\)

This works because:

\[\mathbb{P}(U^{1/a} \le x) = \mathbb{P}(U \le x^a) = x^a.\]

7.2 Alternative view (log transform)#

If \(X\sim\texttt{powerlaw}(a)\), then \(-\log X\sim\mathrm{Exponential}(\text{rate}=a)\). This can be useful for diagnostics and for building hierarchical models.

def powerlaw_rvs_numpy(a: float, size: int, rng: np.random.Generator, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    '''Sample from powerlaw(a, loc, scale) using NumPy-only inverse CDF.'''

    if a <= 0:
        raise ValueError("a must be > 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")

    u = rng.random(size)

    # u can be exactly 0 with tiny probability (finite-precision RNG);
    # clipping avoids -inf if you later take logs of samples.
    u = np.clip(u, np.finfo(float).tiny, 1.0)

    x = u ** (1.0 / a)
    return loc + scale * x


# Monte Carlo validation

a0 = 0.6
n = 200_000
samples = powerlaw_rvs_numpy(a0, n, rng)

mom = powerlaw_moments(a0)

mc = {
    "mean": samples.mean(),
    "variance": samples.var(ddof=0),
    "skewness": stats.skew(samples, bias=True),
    "excess_kurtosis": stats.kurtosis(samples, fisher=True, bias=True),
}

mom, mc
({'mean': 0.37499999999999994,
  'variance': 0.09014423076923074,
  'skewness': 0.4625924443258073,
  'kurtosis': 1.9468599033816425,
  'excess_kurtosis': -1.0531400966183575,
  'entropy_nats': -0.15584104290067602},
 {'mean': 0.3743840717431062,
  'variance': 0.09018237402830535,
  'skewness': 0.4607797772203317,
  'excess_kurtosis': -1.0573167628324114})

8) Visualization#

We’ll visualize:

  • the PDF for multiple values of a

  • the CDF and an empirical CDF from Monte Carlo samples

  • a histogram of Monte Carlo samples with the theoretical PDF overlay

# Histogram + theoretical PDF overlay

a = 0.6
n = 50_000
samples = powerlaw_rvs_numpy(a, n, rng)

x_grid = np.linspace(1e-4, 1 - 1e-4, 600)

fig = go.Figure()
fig.add_trace(
    go.Histogram(
        x=samples,
        nbinsx=80,
        histnorm="probability density",
        name="Monte Carlo",
        opacity=0.6,
    )
)
fig.add_trace(go.Scatter(x=x_grid, y=powerlaw_pdf(x_grid, a), mode="lines", name="theoretical PDF"))

fig.update_layout(
    title=f"Monte Carlo samples vs theoretical PDF (n={n}, a={a:g})",
    xaxis_title="x",
    yaxis_title="density",
    bargap=0.02,
)
fig.show()
# Empirical CDF vs true CDF

a = 2.0
n = 30_000
samples = powerlaw_rvs_numpy(a, n, rng)

xs = np.sort(samples)
ys = np.arange(1, n + 1) / n

x_grid = np.linspace(0, 1, 400)

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ys, mode="lines", name="empirical CDF"))
fig.add_trace(go.Scatter(x=x_grid, y=powerlaw_cdf(x_grid, a), mode="lines", name="true CDF"))

fig.update_layout(
    title=f"Empirical CDF vs true CDF (n={n}, a={a:g})",
    xaxis_title="x",
    yaxis_title="F(x)",
)
fig.show()

9) SciPy Integration (scipy.stats.powerlaw)#

SciPy provides scipy.stats.powerlaw, which implements the same distribution with an additional location-scale transform.

  • Canonical form: stats.powerlaw(a) has support \([0,1]\).

  • Location-scale: stats.powerlaw(a, loc=ℓ, scale=s) has support \([\ell, \ell+s]\).

Because powerlaw(a) is \(\mathrm{Beta}(a,1)\), you can also verify equivalence via scipy.stats.beta(a, 1).

from scipy.stats import beta, powerlaw

# Define a SciPy powerlaw distribution

a = 2.5
rv = powerlaw(a, loc=0.0, scale=1.0)

x_grid = np.linspace(1e-4, 1 - 1e-4, 500)

# SciPy API
pdf_scipy = rv.pdf(x_grid)
cdf_scipy = rv.cdf(x_grid)

# Compare to our implementations
max_pdf_diff = np.max(np.abs(pdf_scipy - powerlaw_pdf(x_grid, a)))
max_cdf_diff = np.max(np.abs(cdf_scipy - powerlaw_cdf(x_grid, a)))

# Sampling
samples_scipy = rv.rvs(size=50_000, random_state=rng)

# Fit (MLE). If you know data are on [0,1], fix loc/scale.
a_hat_fit, loc_hat, scale_hat = powerlaw.fit(samples_scipy, floc=0.0, fscale=1.0)

a_hat_closed = powerlaw_mle_shape(samples_scipy)

# Beta equivalence check
rv_beta = beta(a, 1.0)
max_pdf_diff_beta = np.max(np.abs(rv_beta.pdf(x_grid) - pdf_scipy))

(max_pdf_diff, max_cdf_diff, a_hat_fit, a_hat_closed, max_pdf_diff_beta)
(4.440892098500626e-16,
 0.0,
 2.495491295398117,
 2.495491295398117,
 2.6645352591003757e-15)

10) Statistical Use Cases#

10.1 Hypothesis testing#

A common simple test is whether data are uniform on \([0,1]\) (\(a=1\)) versus a powerlaw alternative (\(a\neq 1\)). You can use a likelihood ratio test (LRT) or a goodness-of-fit test like KS.

10.2 Bayesian modeling#

Because the likelihood in \(a\) has the form

\[L(a) \propto a^n\exp\Big(a \sum\log x_i\Big),\]

a Gamma prior on \(a\) is conjugate.
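Concretely, with a rate-parameterized prior \(a \sim \mathrm{Gamma}(\alpha_0, \beta_0)\), multiplying prior and likelihood gives

\[p(a\mid x) \;\propto\; a^{\alpha_0-1}e^{-\beta_0 a}\cdot a^{n}\exp\Big(a\sum_{i}\log x_i\Big) \;\propto\; a^{\alpha_0+n-1}\exp\Big(-\Big(\beta_0-\sum_{i}\log x_i\Big)a\Big),\]

which is the kernel of \(\mathrm{Gamma}\big(\alpha_0+n,\ \text{rate}=\beta_0-\sum_i\log x_i\big)\); the posterior rate is positive because \(\sum_i\log x_i<0\).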

10.3 Generative modeling#

powerlaw is a handy one-parameter generator on \([0,1]\):

  • generate “best-of-\(n\)” maxima of uniforms (exactly)

  • generate random weights/thresholds with controllable concentration near 0 or 1

# Hypothesis test: H0 a=1 (uniform) vs H1 a free (powerlaw)

n = 5_000
a_true = 0.6
x = powerlaw_rvs_numpy(a_true, n, rng)

# Null log-likelihood (a=1)
ll_null = powerlaw_loglikelihood(x, a=1.0)

# MLE under alternative

a_hat = powerlaw_mle_shape(x)
ll_alt = powerlaw_loglikelihood(x, a=a_hat)

lrt = 2.0 * (ll_alt - ll_null)
p_value = stats.chi2.sf(lrt, df=1)

# KS test against fitted distribution
D, p_ks = stats.kstest(x, powerlaw(a_hat).cdf)

{"a_true": a_true, "a_hat": a_hat, "LRT": lrt, "p_value_chi2": p_value, "KS_D": D, "KS_p": p_ks}
{'a_true': 0.6,
 'a_hat': 0.5879063580561444,
 'LRT': 1697.6355832402796,
 'p_value_chi2': 0.0,
 'KS_D': 0.009659281640087447,
 'KS_p': 0.735458175746772}
# Bayesian update for a with a conjugate Gamma prior

# Prior: a ~ Gamma(alpha0, rate=beta0)
alpha0 = 2.0
beta0 = 1.0  # rate

n = 2_000
a_true = 2.5
x = powerlaw_rvs_numpy(a_true, n, rng)

S = float(np.sum(np.log(x)))  # negative

# Posterior: a | x ~ Gamma(alpha0 + n, rate=beta0 - S)
alpha_post = alpha0 + n
beta_post = beta0 - S

post = stats.gamma(a=alpha_post, scale=1.0 / beta_post)  # SciPy gamma uses scale = 1/rate

post_mean = post.mean()
ci_95 = post.ppf([0.025, 0.975])

# Plot posterior density

a_grid = np.linspace(post.ppf(0.001), post.ppf(0.999), 400)

fig = go.Figure()
fig.add_trace(go.Scatter(x=a_grid, y=post.pdf(a_grid), mode="lines", name="posterior"))
fig.add_vline(x=a_true, line_dash="dash", line_color="black", annotation_text="true a")

fig.update_layout(
    title="Posterior for a with Gamma prior (conjugate)",
    xaxis_title="a",
    yaxis_title="density",
)
fig.show()

{"a_true": a_true, "posterior_mean": post_mean, "ci_95": ci_95}
{'a_true': 2.5,
 'posterior_mean': 2.399059202286664,
 'ci_95': array([2.2951, 2.5053])}
# Generative modeling pattern: max of n uniforms equals powerlaw(a=n)

n = 5
m = 80_000

# Max of n uniforms
u = rng.random((m, n))
max_u = u.max(axis=1)

# Equivalent powerlaw with a=n
samples_pw = powerlaw_rvs_numpy(a=float(n), size=m, rng=rng)

# Compare empirical CDFs
xs1 = np.sort(max_u)
ys1 = np.arange(1, m + 1) / m

xs2 = np.sort(samples_pw)
ys2 = np.arange(1, m + 1) / m

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs1, y=ys1, mode="lines", name="max of n uniforms"))
fig.add_trace(go.Scatter(x=xs2, y=ys2, mode="lines", name="powerlaw(a=n)"))
fig.update_layout(
    title=f"Max of n={n} uniforms vs powerlaw(a={n})",
    xaxis_title="x",
    yaxis_title="empirical CDF",
)
fig.show()

# Two-sample KS test (should not reject for large m)
ks = stats.ks_2samp(max_u, samples_pw)
ks
KstestResult(statistic=0.005637499999999962, pvalue=0.15667073167578494, statistic_location=0.9452325925703184, statistic_sign=1)

11) Pitfalls#

  • Name confusion: SciPy’s powerlaw is bounded on \([0,1]\); heavy-tailed “power-law” behavior is better modeled by distributions like Pareto/Zipf.

  • Parameter constraints: \(a>0\), \(s>0\). Invalid values should raise errors.

  • Boundary values: the canonical density is defined on \((0,1)\); exact 0 or 1 values make \(\log x\) problematic for likelihood-based methods.

    • In practice: clip values (with care) or model measurement noise.

  • Divergence at 0 for \(a<1\): the PDF goes to \(+\infty\) as \(x\to 0^+\). For plots/integration, start at a small \(\varepsilon\) like \(10^{-6}\).

  • Fitting with free loc/scale: if your data are already in \([0,1]\), fix loc=0, scale=1 for stable estimation of \(a\).
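A minimal sketch of the boundary-value pitfall (the data values are arbitrary): an exact 0 in the data drives the log-density to \(-\infty\), which breaks likelihood-based fitting.

```python
import numpy as np
from scipy import stats

# Sketch: an exact 0 gives logpdf = -inf for a > 1; clip or drop
# such points before likelihood-based estimation.
x = np.array([0.0, 0.2, 0.9])
vals = stats.powerlaw.logpdf(x, 2.5)
print(vals)  # first entry is -inf
```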

12) Summary#

  • powerlaw(a) is a continuous distribution on \([0,1]\) with CDF \(F(x)=x^a\) and PDF \(f(x)=a x^{a-1}\).

  • It is exactly \(\mathrm{Beta}(a,1)\), and for integer \(a=n\) it matches the distribution of \(\max\{U_1,\dots,U_n\}\) for i.i.d. uniforms.

  • Moments are simple: \(\mathbb{E}[X]=\frac{a}{a+1}\), \(\mathrm{Var}(X)=\frac{a}{(a+1)^2(a+2)}\), and \(\mathbb{E}[X^k]=\frac{a}{a+k}\).

  • Sampling is easy with the inverse CDF: \(X=U^{1/a}\).

  • SciPy integration is straightforward with scipy.stats.powerlaw, using loc/scale for interval transforms.